Abstract. We show how definite extended logic programs can be used for
defining and reasoning with rough sets. Moreover, a rough-set-specific query
language is presented and an answering algorithm is outlined. Thus, we
not only show a possible application of a paraconsistent logic to the field
of rough sets but also establish a link between rough set theory and
logic programming, making possible the transfer of expertise between both
fields.



1 Introduction 

This paper shows how the formalism of rough sets [^|j9|^l used for processing of 
uncertain and contradictory data relates to paraconsistent logic programming B . 
This gives a basis for efficient implementation of rough sets in logic programming. 

Since the mid-eighties, rough sets have been a subject of intensive research. The
literature on rough sets includes both theoretical studies and reports on
applications (for more information and a bibliography, see the home page of the Rough
Set Society, http://www.roughsets.org).

A rough set is usually defined by a decision table which can be seen as a finite 
collection of ground positive and negative datalog facts. The table may include 
both a positive fact and its negation, thus it may represent inconsistent informa- 
tion. The intuition is that a decision table defines a set, and the inconsistent facts 
identify elements with uncertain membership. A table may also include multiple 
occurrences of facts. This makes it possible to introduce a quantitative measure 
for membership. One of the main concepts is that of the lower approximation 
of a rough set, corresponding to the elements that the decision table asserts as 
being only positive examples. 

Viewing rows of a decision table as datalog facts gives a basis for extending
rough sets to Rough Datalog. In our previous work [7], we proposed such an
extension. Rough datalog makes it possible to define rough sets not only explicitly
as collections of facts (as the decision tables do) but also implicitly by rules. The
fixpoint semantics of rough datalog links the predicates of a program to rough 
relations. However, having predicates denoting rough relations rather than re- 
lations may cause some difficulty in understanding the rules. Furthermore, the 
intuition of a rule as a definition of a rough set is quite complex, since it has 
to define both positive facts and negative facts of the defined rough set. Finally, 
compilation of datalog rules to Prolog, described in Q], may cause explosion of 
the number of Prolog clauses necessary to deal with negative facts. 

In this paper we propose a simplified approach based on the concept of
definite extended logic programs (DXL programs). As mentioned above, decision
tables for rough sets include explicit negative information. This information can 
be expressed in DXL programs by using explicit negation. Thus, DXL programs 
are well suited to represent rough sets. The fixpoint semantics of DXL deter- 
mines then the rough sets specified by a given program. DXL programs can be 
easily implemented and queried in pure Prolog. However, DXL is not expressive 
enough for stating rough-set-specific queries. For example, in DXL it is not pos- 
sible to query lower approximations of the defined rough sets. To achieve this we 
propose to extend DXL with a query language tailored for rough sets. We also 
show how to obtain answers by transforming the queries to usual Prolog queries. 

The rough sets denoted by the predicates occurring in a program are similar 
to the paraconsistent relations used in . The main aim of the work presented 
in 0] is to introduce an algebraic method to construct the well-founded model 
|| for a general deductive database by using paraconsistent relations associated 
with each predicate symbol of the database. Since the well-founded model is 
always a consistent interpretation, predicates occurring in a general deductive 
database denote crisp setsQ. However, in contrast to [Q, the models of the pro- 
grams proposed in our framework may incorporate contradictions. Consequently, 
while well-founded models are 3-valued, we use models in a 4-valued logic.
Moreover, we deal with explicit negation.

The rest of the paper is organized as follows. Section 2 surveys some basic
concepts of rough sets. Section 3 summarizes the semantics of DXL programs and
gives an example of how rough sets can be represented via DXL programs.
Section 4 discusses a rough-set-specific query language and proposes an algorithm
to obtain answers. Section 5 gives some conclusions.

2 Rough sets 

This section gives a brief introduction to Rough Sets. 

We want to deal with the situation where there are conflicting judgments 
about classification of a given object. For example, two patients show identical 
results of clinical tests but one of them has a certain disease and the other does 
not have it, or the experts looking at the medical record of a patient may disagree 
on the diagnosis. The concept of a rough set makes it possible to express such a
situation. More precisely, the situation can be described as follows. We have a
universe of objects, each of them characterized by a tuple of attribute values 
and by a decision attribute classifying the object. For simplicity, we assume two- 
valued (say "yes" or "no" ) classification. This can be seen as a definition of a set 
consisting of all objects with the decision attribute "yes". However, objects with 
identical attribute values may have different values of the decision attribute. 
Since we only access objects by their attribute values, the double classification 
describes the boundary region, where we cannot be sure whether the object 
belongs to the defined set or not. Thus intuitively, a rough set S is defined by
indicating the elements of a universe which belong to S and elements which do
not belong to S, while these two categories need not be disjoint. Usually, it is also
assumed that the union of these categories covers the universe. In practice this 
may be achieved by the assumption that the elements which do not appear in 
the decision table are implicitly classified as not belonging to the defined set. In 
this paper we do not make this assumption. This makes it possible to distinguish 
between the tuples whose membership in S is explicitly negated and those for 
which we have no membership evidence. This distinction is well known in the 
field of logic programming, while it seems not to be discussed in the context 
of rough sets. Notice that our assumption does not exclude the possibility that 
the union of both categories covers the universe, and thus generalizes the usual 
approach. 

Example 1. The following table contains patient records with the symptom at- 
tributes temperature, cough, headache, muscle-pain and the diagnosis done by a 
doctor which says whether or not the patient has flu. The table defines a rough 
set, since it includes different diagnoses for some cases with identical symptoms. 



temp    cough  headache  muscle pain  flu
normal  no     no        no           no
subfev  no     yes       yes          no
subfev  no     yes       yes          yes
...
high    yes    yes       yes          yes

The intuitions discussed above can be formalized by the following definitions.

An attribute $a$ is a function $a : U \to V_a$, where $U$ is a universe of objects.
The set $V_a$ is called the value domain of $a$.

We assume that tuples of values provide the only way of referring to objects. 
Two objects are indiscernible with respect to a selected set of attributes, if both 
have the same values for these attributes. Clearly, the indiscernibility relation is 
an equivalence on objects and its equivalence classes are sets of objects which are 
characterized by identical tuples of attribute values. We assume that the tuples 
provide the only access to objects. Hence, the technical definitions are expressed
in terms of tuples. As illustrated by the table above, we specify a rough set S 
by classifying tuples of attribute values as positive or negative examples. 

Definition 1. A rough set $S$ is a pair $(S^+, S^-)$ such that $S^+, S^- \subseteq V_{a_1} \times \cdots \times
V_{a_n}$, for some non-empty set of attributes $\{a_1, \ldots, a_n\}$.

The components $S^+$ and $S^-$ will be called the positive region (or the
positive information) and the negative region (or the negative information) of $S$,
respectively.

We will also use the following notion. 

Definition 2. A rough complement of a rough set $S = (S^+, S^-)$ is the rough
set $\neg S = (S^-, S^+)$.

Using rough set terminology, given a rough set $S = (S^+, S^-)$, the sets $S^+$
and $(S^+ - S^-)$ correspond to the upper approximation and the lower
approximation of $S$, respectively. Thus, the approximations of $\neg S$ are: $S^-$ (the upper
approximation) and $(S^- - S^+)$ (the lower approximation). The set $S^+ \cap S^-$ is
called the boundary (region) of $S$. Intuitively, the lower approximation of $S$ ($\neg S$)
refers to the elements that can certainly be classified as (not) members of $S$.
The elements in the boundary may belong to $S$ ($\neg S$), but we cannot be sure. It
is easy to see that both the lower approximation and boundary of $S$ are subsets
of the upper approximation of $S$.
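
As a small illustration of these notions, the following Python sketch computes the approximations and the boundary directly from the two regions of a rough set. The function names `approximations` and `complement` and the toy one-attribute tuples are assumptions made for this example only; they are not part of the formalism.

```python
def approximations(pos, neg):
    """Return (upper, lower, boundary) for the rough set S = (pos, neg)."""
    upper = set(pos)                 # S+
    lower = set(pos) - set(neg)      # S+ - S-
    boundary = set(pos) & set(neg)   # S+ intersected with S-
    return upper, lower, boundary


def complement(pos, neg):
    """Rough complement: swap the positive and the negative region."""
    return set(neg), set(pos)


# Hypothetical rough set over one-attribute tuples.
s_pos = {("a",), ("b",)}
s_neg = {("b",), ("c",)}

upper, lower, boundary = approximations(s_pos, s_neg)
c_upper, c_lower, c_boundary = approximations(*complement(s_pos, s_neg))

print(sorted(lower), sorted(boundary), sorted(c_lower))
```

Note that the boundary of the complement coincides with the boundary of the original set, since intersection is symmetric.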

The following definition, adopted from || formalizes the idea of the decision 
system (also called decision table) used to define rough sets. 

Definition 3. A (binary) decision system is a pair $\mathcal{D} = (U, A \cup \{d\})$, where $U$
is a universe of objects, and $A \cup \{d\}$ is a non-empty finite set of attributes, such
that $d : U \to \{true, false\}$. We allow that for some $u \in U$ all attribute values,
including the value of $d$, are undefined.

For a given $u \in U$ and set of attributes $A = \{a_1, \ldots, a_n\}$, we denote by $A(u)$
the tuple $(a_1(u), \ldots, a_n(u))$. Recall that $A$ may be undefined for some $u$. Thus,
$A$ is a partial function on objects.

Definition 4. A rough set $D$ specified by a decision system $\mathcal{D} = (U, A \cup \{d\})$
is a pair $(D^+, D^-)$, where

\[ D^+ = \{A(u) \mid u \in U \text{ and } d(u) = true\}, \]
\[ D^- = \{A(u) \mid u \in U \text{ and } d(u) = false\}. \]

Example 2. Consider the rough set Flu specified by the decision system of
Example 1.

It is easy to check that the lower approximation of Flu is the singleton
{(high, yes, yes, yes)}. The set {(normal, no, no, no), (high, no, no, no)} is the
lower approximation of the rough set ¬Flu. The boundary region of Flu consists
of all remaining tuples in the decision table.
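
The construction of Definition 4 can be sketched in Python as follows. The list `rows` contains only those rows of the Flu table that are stated explicitly in the text (it is an illustrative subset, not the complete table), and the names `rough_set`, `flu_pos` and `flu_neg` are assumptions of this sketch.

```python
# Rows (temp, cough, headache, muscle_pain, flu) mentioned explicitly in the text;
# an illustrative subset, not the complete decision table of Example 1.
rows = [
    ("normal", "no", "no", "no", False),
    ("subfev", "no", "yes", "yes", False),
    ("subfev", "no", "yes", "yes", True),
    ("high", "no", "no", "no", False),
    ("high", "yes", "yes", "yes", True),
]


def rough_set(rows):
    """Definition 4: split the attribute tuples by the decision attribute."""
    d_pos = {r[:-1] for r in rows if r[-1]}
    d_neg = {r[:-1] for r in rows if not r[-1]}
    return d_pos, d_neg


flu_pos, flu_neg = rough_set(rows)
lower_flu = flu_pos - flu_neg        # certainly flu
lower_not_flu = flu_neg - flu_pos    # certainly not flu
boundary_flu = flu_pos & flu_neg     # contradictory diagnoses

print(lower_flu)
print(lower_not_flu)
print(boundary_flu)
```

On this subset the two lower approximations agree with those given in Example 2; the boundary contains only the one contradictory tuple present in the subset.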
A binary decision system can be equivalently represented by a set of literals.
We illustrate the idea on the decision table of Example 1. We assume flu to be
a 4-ary predicate letter. Each row of the table is then represented by a literal
with the argument values stated in the row. The literal is positive if the decision
attribute's value is "yes" and negative otherwise. Thus, we obtain the set of
literals:

{¬flu(normal,no,no,no), ¬flu(subfev,no,yes,yes),
 flu(subfev,no,yes,yes), ...}.

3 Definite Extended Logic Programs 

This section recalls the concept of Definite Extended Logic Programs and relates 
them to rough sets. Definite extended logic programs extend classical definite 
logic programs with explicit negation. Similar ideas were discussed by many 
authors, see e.g. [^| Jlf]Jll| , |l^| . We follow here the presentation of the survey paper 
1- 

As discussed above, a rough set $S$ can be defined by providing explicitly a set
of literals with the same predicate letter. The positive literals (e.g. $s(t_1, \ldots, t_n)$)
identify the tuples in the positive region of $S$, while the negative literals (e.g.
$\neg s(t_1, \ldots, t_n)$) determine its negative region. This can be seen as an alternative
representation of a decision system.

Definite Extended Logic Programs provide a more general way of defining 
sets of literals. 

Definition 5. Q A definite extended logic program (DXL program) is a set of 
rules of the form 

$H$ :- $B_1, \ldots, B_n$.    $(n \geq 0)$
where $H, B_1, \ldots, B_n$ are literals.

Notice that rules extend definite clauses by allowing negative literals, both 
in the head and in the body. In the sequel, the rules with empty bodies (facts) 
will be written in the form $H$.

The semantics of DXL programs is defined by viewing each negated literal
$\neg p(t_1, \ldots, t_n)$ as a positive literal $p^-(t_1, \ldots, t_n)$, with a new predicate symbol
$p^-$. In this way, a DXL program $\mathcal{P}$ is transformed into a definite program $\mathcal{P}'$. The
standard least Herbrand model semantics $\mathcal{M}_{\mathcal{P}'}$ of $\mathcal{P}'$ is a set of ground atoms
over the original and the new predicate symbols. The semantics of the DXL
program $\mathcal{P}$, $\mathcal{M}_{\mathcal{P}}$, is defined by replacing each atom of the form $p^-(t_1, \ldots, t_n) \in
\mathcal{M}_{\mathcal{P}'}$ by the corresponding negative literal $\neg p(t_1, \ldots, t_n)$.

Clearly, in general $\mathcal{M}_{\mathcal{P}}$ may include an atom together with its negation.
Thus, a DXL program $\mathcal{P}$ may introduce inconsistencies. This is what is needed
to be able to define rough sets. Each predicate symbol $p$, with arity $n > 0$,
occurring in $\mathcal{P}$ denotes the rough relation (set)

\[ P = (\{(t_1, \ldots, t_n) \mid p(t_1, \ldots, t_n) \in \mathcal{M}_{\mathcal{P}}\},\ \{(t_1, \ldots, t_n) \mid \neg p(t_1, \ldots, t_n) \in \mathcal{M}_{\mathcal{P}}\}). \]
For a model theoretic semantics for DXL programs based on the four-valued
Belnap's logic the reader is referred to Q. 

We now show an example of a definition of rough sets by a DXL program. 

Example 3. We consider the rough relation Flu of Example 1 and a rough
relation Patient with the same attributes as Flu extended with the new ones:
identification, age and sex. Intuitively, the universe of relation Patient is a set of
people who visited a doctor. Its decision attribute shows whether a person has
to be treated for some disease and, therefore, has to be considered a patient. The
decision may be made independently by more than one expert. All decisions are
recorded, which might make the relation rough. The example relation is defined
by the following decision table.



id  age  sex  temp    cough  headache  muscle pain  patient
1   21   ...  normal  no     no        no           no
2   51   ...  subfev  no     yes       yes          yes
3   18   f    subfev  no     yes       yes          no
3   18   f    subfev  no     yes       yes          yes
4   18   m    high    yes    yes       yes          yes



To determine which people are to be treated for flu, we define a new
rough set Ft. Intuitively, these are people possibly qualified as patients, who may
have flu according to the decision table of Example 1. We may also state that the
people not treated for flu are those not qualified as patients, or those qualified
as patients who may not have flu.

This can be expressed as the following DXL program $\mathcal{P}$.

ft(Id) :- patient(Id,Age,Sex,Fev,C,Ha,Mp),
          flu(Fev,C,Ha,Mp).

¬ft(Id) :- ¬patient(Id,Age,Sex,Fev,C,Ha,Mp).

¬ft(Id) :- patient(Id,Age,Sex,Fev,C,Ha,Mp),
           ¬flu(Fev,C,Ha,Mp).

As explained above, the semantics of this program determines the rough
relation Ft. Thus, we can conclude that person 4 is definitely qualified for flu
treatment (i.e. belongs to the lower approximation of Ft). Persons 2 and 3 may
or may not be treated for flu (i.e. belong to the boundary of Ft) and person 1 is
certainly not qualified for flu treatment (i.e. belongs to the lower approximation
of ¬Ft).
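
The semantics described in this section can be sketched in Python: negative literals are renamed to atoms over a fresh predicate, and the least model of the resulting definite program is computed by a naive bottom-up fixpoint. Everything below is an assumption made for illustration: the encoding of literals as `(positive, predicate, arguments)` triples, the `_neg` suffix standing for the new predicate symbol, the convention that capitalised strings are variables, the subset of Flu rows, and the placeholder `"unknown"` for the two sex values that are not available.

```python
def rename(lit):
    """Map a DXL literal (positive?, pred, args) to an atom of the definite program."""
    positive, pred, args = lit
    return (pred if positive else pred + "_neg", args)


def is_var(term):
    return isinstance(term, str) and term[:1].isupper()


def match(atom, fact, subst):
    """Extend subst so that atom equals the ground fact; None if impossible."""
    (pred, args), (fpred, fargs) = atom, fact
    if pred != fpred or len(args) != len(fargs):
        return None
    subst = dict(subst)
    for a, f in zip(args, fargs):
        if is_var(a):
            if subst.setdefault(a, f) != f:
                return None
        elif a != f:
            return None
    return subst


def least_model(rules):
    """Naive bottom-up fixpoint for a definite (datalog-style) program."""
    model, changed = set(), True
    while changed:
        changed = False
        for head, body in rules:
            substs = [{}]
            for atom in body:
                substs = [s2 for s in substs for fact in model
                          if (s2 := match(atom, fact, s)) is not None]
            for s in substs:
                fact = (head[0], tuple(s.get(a, a) for a in head[1]))
                if fact not in model:
                    model.add(fact)
                    changed = True
    return model


# Decision tables as rows; the last component is the decision attribute.
flu_rows = [("normal", "no", "no", "no", False),
            ("subfev", "no", "yes", "yes", False),
            ("subfev", "no", "yes", "yes", True),
            ("high", "no", "no", "no", False),
            ("high", "yes", "yes", "yes", True)]
patient_rows = [(1, 21, "unknown", "normal", "no", "no", "no", False),
                (2, 51, "unknown", "subfev", "no", "yes", "yes", True),
                (3, 18, "f", "subfev", "no", "yes", "yes", False),
                (3, 18, "f", "subfev", "no", "yes", "yes", True),
                (4, 18, "m", "high", "yes", "yes", "yes", True)]

PAT = ("Id", "Age", "Sex", "Fev", "C", "Ha", "Mp")
FLU = ("Fev", "C", "Ha", "Mp")

program = (
    [((r[-1], "flu", r[:-1]), []) for r in flu_rows]
    + [((r[-1], "patient", r[:-1]), []) for r in patient_rows]
    + [((True, "ft", ("Id",)), [(True, "patient", PAT), (True, "flu", FLU)]),
       ((False, "ft", ("Id",)), [(False, "patient", PAT)]),
       ((False, "ft", ("Id",)), [(True, "patient", PAT), (False, "flu", FLU)])]
)

# Transform the DXL program and compute its least model.
definite = [(rename(h), [rename(b) for b in body]) for h, body in program]
model = least_model(definite)

ft_pos = {args for pred, args in model if pred == "ft"}
ft_neg = {args for pred, args in model if pred == "ft_neg"}
print(sorted(ft_pos - ft_neg), sorted(ft_pos & ft_neg), sorted(ft_neg - ft_pos))
```

With these inputs the lower approximation of Ft, its boundary, and the lower approximation of ¬Ft come out as persons 4, {2, 3} and 1 respectively, in line with the conclusions stated above. The naive fixpoint is chosen for brevity; it is not meant as an efficient evaluation strategy.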

4 Rough Set Queries 

The transformed version $\mathcal{P}'$ of a DXL program $\mathcal{P}$, defined above, may be used
by a Prolog system for answering queries about rough sets. Notwithstanding
the incompleteness of Prolog, we conclude that whenever the query evaluation
terminates and succeeds, we obtain an answer showing an instance of the query
consisting of the elements of the least model.

Other systems exist that can answer queries w.r.t. a normal program, for
instance, XSB-Prolog (for more details see http://xsb.sourceforge.net/).
Hence, those systems could also be used to implement our query answering
algorithm.

4.1 A Query Language for Rough Sets 

Since the proposed query answering technique refers to the least model of the
transformed program, in the terminology of rough sets the answer concerns the
upper approximations of the defined rough sets. For example, consider program
$\mathcal{P}$ of Example 3: the answer yes to the query $?(\mathcal{P}, ft(4))$ means that person
4 belongs to the upper approximation of the rough set Ft. However, it may also
be important to check whether a given element is in the lower approximation
of a rough set, or which elements are in the boundary region of a given set.
Thus, we propose to extend DXL with the following rough-set-specific queries.

Definition 6. A rough query Q is a pair $?(\mathcal{P}, q)$, where $\mathcal{P}$ is a DXL program
and $q$ is defined by the following abstract syntax rules

\[ q \to q' \mid a? \]
\[ q' \to l \mid \underline{l} \mid \overline{\underline{a}} \mid q'_1, q'_2 \]

where $l$ is a literal and $a$ is an atom.

Let $Q = ?(\mathcal{P}, q)$ be a query, given a DXL program $\mathcal{P}$. Then, Q is a simple
query if $q$ is a literal $l$, or of the form $\underline{l}$ (lower approximation) or
$\overline{\underline{a}}$ (boundary), where $a$ is an atom. A composite
query is a sequence of simple queries, separated by commas. A composite query
is interpreted as a conjunction of simple queries.

Let $\mathcal{P}$ be a DXL program and $R$ be the rough relation denoted by predicate
$r$ of $\mathcal{P}$. First, we explain intuitively how the answer to a ground simple query
can be obtained. The answer to a ground simple query may only be yes or no.

- The answer to a query $?(\mathcal{P}, r(t_1, \ldots, t_n))$ ($?(\mathcal{P}, \neg r(t_1, \ldots, t_n))$) is yes iff
the tuple $(t_1, \ldots, t_n)$ belongs to the positive region (negative region) of the
rough relation $R$, defined by $\mathcal{P}$. Otherwise, the answer is no.

- The answer to a query $?(\mathcal{P}, \underline{r(t_1, \ldots, t_n)})$ ($?(\mathcal{P}, \underline{\neg r(t_1, \ldots, t_n)})$) is yes iff
the tuple $(t_1, \ldots, t_n)$ belongs to the lower approximation of $R$ ($\neg R$).
Otherwise, the answer is no.

- The answer to a query $?(\mathcal{P}, \overline{\underline{r(t_1, \ldots, t_n)}})$ is yes iff the tuple $(t_1, \ldots, t_n)$
belongs to the boundary region of $R$. Otherwise, the answer is no.

The ground query $?(\mathcal{P}, r(t_1, \ldots, t_n)?)$ questions what is known about atom
$r(t_1, \ldots, t_n)$ in the least model of $\mathcal{P}$. Four cases are possible. Tuple $(t_1, \ldots, t_n)$
may belong to the boundary region of the denoted rough set $R$, to its lower
approximation, to the lower approximation of $\neg R$, or to none of these. The
respective answers will be: $\top$, yes, no, and $\bot$. Notice that $\top$ represents the existence
of contradictory information and $\bot$ represents absence of information. Although
queries of this kind are not strictly needed, because the same information can be
obtained with several simple queries, they might be useful in practice. For
instance, for a given n-ary predicate $r$, the query $?(\mathcal{P}, r(X_1, \ldots, X_n)?)$ classifies
all possible n-ary tuples with respect to the membership of the rough relation
denoted by $r$.
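
The four possible answers to such a ground query can be sketched as a membership test on the two regions of the rough relation. The function name `classify`, the answer strings `"top"` and `"bottom"`, and the regions used in the usage lines (those of Ft as described in Example 3, plus a person 5 about whom nothing is known) are assumptions of this sketch.

```python
def classify(tup, pos, neg):
    """Answer to the ground query r(t)? given the regions (pos, neg) of R."""
    in_pos, in_neg = tup in pos, tup in neg
    if in_pos and in_neg:
        return "top"      # contradictory information: boundary of R
    if in_pos:
        return "yes"      # lower approximation of R
    if in_neg:
        return "no"       # lower approximation of the complement of R
    return "bottom"       # no information about the tuple


# Regions of Ft as described in Example 3.
ft_pos = {(2,), (3,), (4,)}
ft_neg = {(1,), (2,), (3,)}

answers = {person: classify((person,), ft_pos, ft_neg) for person in range(1, 6)}
print(answers)
```

The same function answers the three corresponding simple queries at once, which is the sense in which this kind of query is a convenience rather than an extension of expressive power.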

A natural extension to the case of non-ground simple queries $?(\mathcal{P}, q)$ (i.e. $q$ contains
some variables) gives as answer the set of all valuations $\theta$ for which the query
instance $?(\mathcal{P}, \theta(q))$ satisfies the above mentioned conditions. The answer no
represents the empty set of valuations, and the answer yes corresponds to the
set of all ground valuations of the variables of the query.

The answer to a query of the form $?(\mathcal{P}, r(t_1, \ldots, t_n)?)$, where $r(t_1, \ldots, t_n)$
is a non-ground atom, is a triple of sets $(A_1, A_2, A_3)$: set $A_1$ corresponds to the
instances of the query that belong to the boundary of $R$; set $A_2$ corresponds to
the instances of the query that belong to the lower approximation of $R$; set $A_3$
corresponds to the instances of the query that belong to the lower approximation
of $\neg R$. Obviously, answers to queries of this type can be obtained by issuing the
simple queries $?(\mathcal{P}, \overline{\underline{r(t_1, \ldots, t_n)}})$, $?(\mathcal{P}, \underline{r(t_1, \ldots, t_n)})$ and $?(\mathcal{P}, \underline{\neg r(t_1, \ldots, t_n)})$.

The above ideas can be easily extended to the case of composite queries. Note
that a query of the form $?(\mathcal{P}, q?)$ cannot be involved in a composite query.

Example 4. Consider the program $\mathcal{P}$ of Example 3 defining the rough relation
Ft. We may pose queries like $?(\mathcal{P}, ft(3))$, the boundary query $?(\mathcal{P}, \overline{\underline{ft(X)}})$, or $?(\mathcal{P}, \neg ft(4))$. The
obtained answer would then be: yes for the first query; {X=2, X=3} for the second
one; and no for the last one.



4.2 Implementing Rough Queries in Prolog 

As already discussed, the simple literal queries for a DXL program $\mathcal{P}$ can be
directly answered in Prolog by using the transformed version $\mathcal{P}'$ of $\mathcal{P}$.

We now show how the remaining queries can also be answered by transforming
them to Prolog queries for $\mathcal{P}'$. We define the following transformation $\tau$ of
simple queries to Prolog queries, where not denotes Prolog negation as failure
(\+).



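\[
\tau(Q) =
\begin{cases}
q(t_1, \ldots, t_n) & \text{if } Q = q(t_1, \ldots, t_n) \\
q^-(t_1, \ldots, t_n) & \text{if } Q = \neg q(t_1, \ldots, t_n) \\
q(t_1, \ldots, t_n),\ \mathit{not}\ q^-(t_1, \ldots, t_n) & \text{if } Q = \underline{q(t_1, \ldots, t_n)} \\
q^-(t_1, \ldots, t_n),\ \mathit{not}\ q(t_1, \ldots, t_n) & \text{if } Q = \underline{\neg q(t_1, \ldots, t_n)} \\
q(t_1, \ldots, t_n),\ q^-(t_1, \ldots, t_n) & \text{if } Q = \overline{\underline{q(t_1, \ldots, t_n)}}
\end{cases}
\]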



Let $\mathcal{P}$ be a DXL program. We now claim that the answers obtained by Prolog
evaluation of the query $\tau(Q)$ w.r.t. the program $\mathcal{P}'$ coincide with the answers
defined for $?(\mathcal{P}, Q)$ in Section 4.1. Let $p$ be a predicate letter occurring in $\mathcal{P}$.
By the construction of $\mathcal{P}'$, it follows that an atom $p(t_1, \ldots, t_n)$ belongs to $\mathcal{M}_{\mathcal{P}}$
iff it also belongs to $\mathcal{M}_{\mathcal{P}'}$. Moreover, a negative literal $\neg p(t_1, \ldots, t_n) \in \mathcal{M}_{\mathcal{P}}$
iff the atom $p^-(t_1, \ldots, t_n) \in \mathcal{M}_{\mathcal{P}'}$. Recall that $\mathcal{P}'$ is a definite program. If a
simple query $q(t_1, \ldots, t_n)$ or $q^-(t_1, \ldots, t_n)$ w.r.t. $\mathcal{P}'$ fails in Prolog, then it has
no (ground) instances in $\mathcal{M}_{\mathcal{P}'}$. In view of that, it can be easily checked that
each of the five cases of the definition of $\tau$ satisfies our claim. Take for example
a lower approximation rough query $?(\mathcal{P}, \underline{q(t_1, \ldots, t_n)})$. Assume that the Prolog
answer to $\tau(\underline{q(t_1, \ldots, t_n)})$ w.r.t. $\mathcal{P}'$ returns a valuation $\theta$. Thus, $\theta(q(t_1, \ldots, t_n))$
is in $\mathcal{M}_{\mathcal{P}'}$, hence in $\mathcal{M}_{\mathcal{P}}$. On the other hand, if $\theta(q^-(t_1, \ldots, t_n))$ fails then
$\theta(q^-(t_1, \ldots, t_n)) \notin \mathcal{M}_{\mathcal{P}'}$. Thus, $\theta(\neg q(t_1, \ldots, t_n))$ is not in $\mathcal{M}_{\mathcal{P}}$. Consequently,
$\theta(q(t_1, \ldots, t_n))$ belongs to the lower approximation of the rough set $Q$ denoted
by predicate $q$, as required. One should also consider the case when the Prolog
query $\tau(Q)$ fails w.r.t. $\mathcal{P}'$. This means that $q(t_1, \ldots, t_n)$ fails (i.e. there is no
instance of $q(t_1, \ldots, t_n)$ that belongs to the upper approximation of $Q$) or that
whenever $q(t_1, \ldots, t_n)$ succeeds with a valuation $\theta$ then not $\theta(q^-(t_1, \ldots, t_n))$
fails (i.e. $\theta(\neg q(t_1, \ldots, t_n))$ is in the negative region of $Q$). Thus, in both cases
there is no instance of the query which belongs to the lower approximation of
the rough relation $Q$.
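
The five cases of the transformation can be sketched as a small Python function that produces the text of a Prolog goal for the transformed program. The function name `tau`, the `kind` labels, and the `_neg` suffix used to spell the new predicate symbol are assumptions of this sketch; only `\+` (negation as failure) is taken from the text above.

```python
def tau(kind, pred, args):
    """Translate a simple rough query into a Prolog goal (as text)."""
    atom = f"{pred}({', '.join(args)})"
    neg_atom = f"{pred}_neg({', '.join(args)})"
    goals = {
        "pos": atom,                               # q(t1,...,tn)
        "neg": neg_atom,                           # negated literal
        "lower_pos": f"{atom}, \\+ {neg_atom}",    # lower approximation of Q
        "lower_neg": f"{neg_atom}, \\+ {atom}",    # lower approximation of the complement
        "boundary": f"{atom}, {neg_atom}",         # boundary of Q
    }
    return goals[kind]


print(tau("lower_pos", "ft", ["X"]))   # ft(X), \+ ft_neg(X)
print(tau("boundary", "ft", ["X"]))    # ft(X), ft_neg(X)
```

In the two lower approximation cases the positive goal is placed first, so that the goal under negation as failure is called only after the first goal has bound the query variables.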



5 Discussion and Conclusions 

The contribution of the paper is twofold. First, it establishes a link between logic 
programming and rough set theory that makes it possible to combine techniques
originating from both fields. Second, we show an application of the techniques 
developed in the area of paraconsistent logic. 

We relate DXL programs to rough sets: we have shown that the least model
of any DXL program can be seen as a family of rough relations. Although this
observation is technically very straightforward, it opens the way to using Prolog for
defining and manipulating rough sets. To our knowledge this approach is
novel as concerns rough sets. It improves and simplifies our recent work on rough
datalog [7], by providing a more flexible technique for defining negative regions of
rough sets, which results in a simplification of the semantics.

The language of rough queries brings the specificity of rough sets to paracon- 
sistent logic programming. It should be clear that with this language, mainly due 
to the use of lower approximations, we implicitly introduce a very restricted form 
of default negation into DXL. A natural question is whether the lower approx- 
imations should be introduced into bodies. There may be example applications 
such that the reference to lower approximations in the rules may be desirable. 
However, so far the interest of the rough sets community in nonmonotonic reasoning
seems to be rather limited.

Extension of the language with lower approximations in the body would 
require a more sophisticated semantics. However, such an extension would still 
not allow a free use of default negation. An interesting question is then whether 
these restrictions make it possible to provide a simple and intuitive semantics. 